- Unit 7 High Availability Solutions
- Purpose
- This unit describes how HP's high-availability solutions can assist the
- high-end customer in maximizing data and system availability.
-
- Objectives
- At the end of this unit, you will be able to:
- o Explain why high availability is particularly important to high-end
- customers.
- o List the major high-availability solutions available from HP.
- o Describe why HP high-availability solutions are superior to
- competitive solutions.
- o Describe key points in HP's vision for the future of high
- availability.
-
- Introduction
- High availability is a critical aspect of systems management that
- ensures the consistent and dependable availability of both data and
- systems. High availability means that access to mission-critical data
- and applications is maximized by:
- o Minimizing unplanned down-time
- o Minimizing or eliminating planned down-time
- High-availability solutions protect a business's ability to function.
- This is particularly true with high-end customers because:
- o Minutes or hours of down-time may mean thousands of dollars in lost
- productivity, lost revenue, and increased expenses.
- o High-end configurations are larger, so the impact of unavailable data
- and resources affects a larger community of users.
- o The significant investment in hardware warrants a highly available
- environment.
- o Operations may be round-the-clock, and any unavailability has a
- significant effect.
- In high-end environments, system down-time cannot be tolerated.
-
- HP Solutions
- HP Strategy
- HP offers a range of products that provide increasing levels of high
- availability, enabling customers to tailor the best solutions for their
- environments. The focus is on minimizing both planned and unplanned
- down-time.
-
- [Figure: High Availability in the Data Center, caption: none]
-
- Key Messages
- HP continues to provide industry-leading hardware reliability with
- PA-RISC and VLSI technologies. Today, HP's Corporate Business Systems build
- on this foundation by providing built-in high-availability features in
- the hardware.
- HP's software is reliable and is subjected to rigorous quality
- testing. In addition, MPE/iX provides a comprehensive software
- resiliency strategy that anticipates software failures before they occur
- and takes action to further improve system up-time.
- In addition to high reliability, HP offers solutions to further
- enhance data integrity and availability, system availability, and fault
- and disaster tolerance. HP also has solutions to address the various
- causes of unplanned failures.
-
- [Figure: Causes of Unplanned Failures, caption: none]
-
- Highly Reliable Hardware and Software
- Reliable Hardware
- For the HP 3000 and HP 9000
- The HP 3000 and HP 9000 Corporate Business Systems incorporate high-
- availability features in their error correcting circuitry, memory
- arrays, I/O channels, and main processor memory bus. Easy
- deconfiguration of failed CPUs, memory, or I/O interface modules enables
- Corporate Business Systems to achieve a maximum hardware up-time of 99.97%.
- Also included are standard features such as automatic system recovery
- with powerfail (battery backup).
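To put the 99.97% figure in perspective, the short calculation below converts an availability percentage into hours of down-time per year. It is a rough sketch only: the 365-day, round-the-clock year is an assumption not stated in the source, and the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 24x7 operation


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours per year a system is unavailable at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


downtime = annual_downtime_hours(99.97)
print(f"{downtime:.2f} hours of down-time per year")
```

Under these assumptions, 99.97% up-time corresponds to roughly 2.6 hours of hardware down-time per year.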
-
- Reliable Software
- Both the MPE/iX and HP-UX operating systems have been designed to offer
- a superior level of software reliability.
-
- System Boot-Up
- During the system boot-up process, if an error occurs while mounting
- user volumes, MPE/iX and HP-UX will identify the problem volumes and
- continue to mount the others. The systems also provide auto
- configuration at system startup.
- For the HP 3000
- MPE/iX is solid even under heavy loading and is a bullet-proof operating
- environment.
-
- Try/Recover Routines
- For example, MPE/iX Release 4.0 has been protected in 24 additional ways
- from system failures. In addition, try/recover routines, which have
- always existed in MPE/iX, have been implemented in even more places for
- even higher resiliency. Try/Recover routines enable MPE/iX to recover
- from error conditions without failing the system. This ongoing
- commitment to software excellence makes MPE/iX the most robust
- commercial operating system in the industry, next to MVS/ESA.
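The idea behind a try/recover routine can be sketched in a few lines. The Python below is only an analogy, not MPE/iX code: the helper try_recover and the failing operation read_table_entry are hypothetical names chosen for illustration.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def try_recover(operation: Callable[[], T], recover: Callable[[Exception], T]) -> T:
    """Run operation; if it raises, hand the error to recover instead of crashing."""
    try:
        return operation()
    except Exception as error:  # a real system would trap only specific faults
        return recover(error)


def read_table_entry() -> int:
    # Hypothetical operation that hits an error condition.
    raise ValueError("corrupt table entry")


# The caller keeps running with a fallback value instead of failing.
result = try_recover(read_table_entry, lambda error: -1)
```

The point of the pattern is that the error is contained where it occurs, so the caller (here, the rest of the system) continues to run.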
-
- Table Monitor
- Another bullet-proof capability comes with the use of a Table Monitor.
- This feature, available in Q4CY92, monitors table usage and proactively
- responds to prevent tables from being exceeded. This feature, combined
- with new larger table sizes, will contribute to making the HP 3000 even
- more available to users on a daily basis.
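A proactive monitor of this kind might look like the sketch below. The source does not say how the Table Monitor responds, so the response shown here (growing the table once usage crosses a threshold) is an assumption, as are the names SystemTable and monitor_table and the 80% threshold.

```python
from dataclasses import dataclass


@dataclass
class SystemTable:
    name: str
    capacity: int
    used: int


def monitor_table(table: SystemTable, threshold: float = 0.8, growth: float = 1.5) -> bool:
    """Grow the table if usage crosses the threshold; return True if action was taken."""
    if table.used / table.capacity >= threshold:
        table.capacity = int(table.capacity * growth)
        return True
    return False


table = SystemTable(name="process_table", capacity=100, used=85)
acted = monitor_table(table)  # usage is 85%, so the monitor steps in
```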
-
- Aggregate Parallel Recovery
- MPE/iX Release 4.0 incorporates a new feature called Aggregate Parallel
- Recovery (APR). This enables system and user volumes to be recovered in
- parallel rather than serially. This speeds up the system bootup process,
- increasing system up-time.
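The benefit of recovering volumes in parallel rather than serially can be illustrated with a small sketch. This is not how APR is implemented; recover_volume is a hypothetical stand-in whose sleep simulates recovery work, and threads are used only to show the principle.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def recover_volume(name: str) -> str:
    """Stand-in for recovering one volume; the sleep simulates the work."""
    time.sleep(0.05)
    return name


def recover_serially(volumes: list[str]) -> list[str]:
    # Total time grows with the number of volumes.
    return [recover_volume(volume) for volume in volumes]


def recover_in_parallel(volumes: list[str]) -> list[str]:
    # Each volume is recovered on its own worker thread.
    with ThreadPoolExecutor(max_workers=max(1, len(volumes))) as pool:
        return list(pool.map(recover_volume, volumes))


volumes = ["system", "user1", "user2", "user3"]
recovered = recover_in_parallel(volumes)
```

With four volumes, the serial version waits for four simulated recoveries in turn, while the parallel version waits for roughly one.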
- For the HP 9000
- HP-UX is the most mature commercial UN*X operating environment in the
- industry. Extensive testing is done on all major releases to ensure
- reliability and resiliency.
-
- Data Integrity and Availability
- Both the HP 3000 and HP 9000 offer a variety of Data Integrity
- solutions.
-
- For the HP 3000 and HP 9000
- RAID
- The use of Redundant Arrays of Inexpensive Disks (RAID), or parity disk
- arrays, provides added measures of data availability and recovery in the
- event of a disk failure. The use of disk arrays is very common in high-
- end environments.
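How a parity disk allows recovery from a single disk failure can be shown with XOR arithmetic. The sketch below is a simplified illustration, assuming equal-length blocks and one parity block per stripe; the block contents and the helper name xor_blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"ACCT", b"0042", b"PAID"]   # blocks on three data disks
parity = xor_blocks(data)            # block stored on the parity disk

# The second disk fails: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the missing block.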
-
- [Figure: High Data Availability, caption: none]
-
- Disk Mirroring
- Software disk mirroring further increases the availability of disks
- beyond disk arrays. A disk array will not remain available if the array
- controller, interface card, or power supply fails. With mirroring, data
- is also available from another array with a functioning controller,
- interface card, or power supply.
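The mirroring behavior described above can be sketched as follows. This is a toy model, not HP's mirroring software: the class MirroredVolume is hypothetical, each array is represented by a dictionary, and resynchronizing a copy that returns to service is not modeled.

```python
class MirroredVolume:
    """Keeps identical copies of every block on two independent arrays."""

    def __init__(self) -> None:
        self.copies: list[dict[int, bytes]] = [{}, {}]
        self.online = [True, True]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each copy that is currently on-line.
        for index, copy in enumerate(self.copies):
            if self.online[index]:
                copy[block] = data

    def read(self, block: int) -> bytes:
        # Reads are served by the first copy that is still on-line.
        for index, copy in enumerate(self.copies):
            if self.online[index]:
                return copy[block]
        raise IOError("no mirror copy is available")


volume = MirroredVolume()
volume.write(7, b"payroll")
volume.online[0] = False          # controller or power supply on array 0 fails
survivor = volume.read(7)         # data is still served from array 1
```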
-
- SCSI Mirroring
- Mirroring of SCSI disk drives is supported on MPE/iX 4.0 and HP-UX 9.0.
- For the HP 3000
- MPE/iX Transaction Manager
- MPE/iX features an integrated, built-in transaction manager that
- automatically and transparently provides data integrity for databases,
- indexed (keyed) sequential access method (KSAM) files, system and file
- system directories, and other critical system tables.
- In the event of a system failure or extended power outage, the
- contents of memory, databases, files, and critical system tables are
- automatically and transparently restored to a state of data integrity by
- the transaction manager.
-
- NetBase Shadow
- Disk mirroring and disk sharing across a network are now available with
- the NetBase Shadow feature, enhancing data availability on the HP 3000.
- This feature differs from Mirrored Disk/iX in that it can mirror to more
- than 12 sites, and across a WAN. Any mirrored set of disks can be used
- for on-line backup purposes.
-
- For the HP 9000
- Logical Volume Manager
- Because of HP's strong commitment to standards, OSF's Logical Volume
- Manager (LVM) is supported on HP-UX. LVM mirroring maintains up to three
- copies of data on separate disks. Disks in a mirrored pair or triplet
- can be taken off-line for backup while applications continue to access
- data on-line. LVM also provides the capability for files (maximum 2
- Gbytes) to span multiple physical volumes, improving performance and
- availability.
-
- System Availability
- For the HP 3000 and HP 9000
- SwitchOver
- Response to a failure at the system level should be automatic,
- quickly returning operations to normal. HP's solutions, such as SPU
- SwitchOver/iX for MPE/iX and SwitchOver/UX for HP-UX, provide near-
- continuous operation of mission-critical computing environments.
- SwitchOver/UX provides for automatic fault detection and recovery of the
- failed SPU.
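Automatic fault detection is commonly built on a heartbeat between the primary and standby systems. The source does not describe the mechanism SwitchOver uses, so the sketch below is an assumption for illustration only; the class StandbyMonitor and the 5-second timeout are hypothetical.

```python
class StandbyMonitor:
    """Standby SPU that takes over when the primary's heartbeat goes quiet."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = 0.0
        self.active = False

    def heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat arrives from the primary.
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over if no heartbeat arrived within the timeout."""
        if not self.active and now - self.last_heartbeat > self.timeout_seconds:
            self.active = True
        return self.active


monitor = StandbyMonitor(timeout_seconds=5.0)
monitor.heartbeat(now=100.0)
still_standby = monitor.check(now=103.0)   # primary is healthy
took_over = monitor.check(now=106.5)       # heartbeat missed, standby takes over
```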
-
- HP Support Watch and HP Predictive Support
- HP also provides services to minimize unplanned system down-time. HP
- Support Watch for HP-UX and HP Predictive Support for MPE/iX monitor the
- system to proactively detect and report hardware faults to the Response
- Center and system operator before they cause a failure.
-
- For the HP 3000
- AutoRestart/iX
- With MPE/iX release 4.0, AutoRestart/iX (included with HP 3000 CS DX)
- has been enhanced. Compression is available to reduce the amount of disk
- space needed for a dump, and the need to dedicate an entire disk for
- dumps has been eliminated. In addition, a toggle has been added to
- enable customers to turn the autoboot feature in AutoRestart/iX on or
- off.
-
- Disaster Tolerance
- Natural disasters that interrupt the availability of systems and data
- can ruin an entire business. High-end data center managers are under
- pressure to have disaster recovery plans.
-
- For the HP 3000
- NetBase
- NetBase for MPE/iX offers wide-area disaster recovery. By automatically
- maintaining copies of data throughout a geographically dispersed
- network, NetBase Shadowing ensures the availability of the data in the
- event of a natural disaster. If a system on the network should go down,
- one command can redirect file access to an alternate computer, bringing
- an unavailable application back on-line in a very short period of time.
-
- For the HP 3000 and HP 9000
- Disaster Recovery
- HP also offers Disaster Recovery Services to allow customers to plan for
- a natural disaster. In the event of a disaster, the HP Backup service
- provides around-the-clock access to fully operational configurations.
-
- [Figure: US map, caption: none]
-
- Competition
- Reliable Hardware and Software
- Although IBM's 3090/ES9000 is perceived to be
- extremely reliable, HP's systems provide higher availability due to a
- simpler design with fewer parts (the HP 3000 has won DataPro's
- reliability rating for years). Water cooling on IBM's 3090, for example,
- introduces additional points of failure. Since HP systems are air-
- cooled, these points of failure do not exist.
- MVS/ESA, as software, provides a high degree of fault resilience to
- system failure. In the event of a near-fatal fault, software errors are
- trapped via Functional Recovery Routines and are worked around
- dynamically.
- MVS/ESA offers more granularity in the way it shuts down, but
- applications must be specifically coded to take advantage of it. For
- example, IBM recommends dedicating one database per application. If the
- database fails, only that one application is unavailable. However, it
- creates additional programming work and overhead to enable the databases
- to interact.
- MPE/iX Release 4.0 incorporates many new software resiliency features
- to enhance system up-time. It is the most reliable commercial operating
- system next to MVS/ESA. HP-UX is the most mature and reliable UNIX
- operating environment in the industry.
-
- Data Integrity and Availability
- Both HP's MPE/iX and IBM's MVS/ESA provide a high degree of data
- integrity through facilities such as transaction logging. Transaction
- logging allows for data to be recovered from a log file in the event of
- a "soft" failure (using rollback recovery), or a "hard" failure (using
- rollforward recovery).
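The two recovery modes can be illustrated with a toy log. This is a simplified sketch, not the MPE/iX or MVS/ESA implementation: the log record layout, the function names roll_back and roll_forward, and the account data are all assumptions made for the example.

```python
# Each log record: (transaction id, key, before value, after value).
LogRecord = tuple[int, str, int, int]


def roll_back(database: dict[str, int], log: list[LogRecord], committed: set[int]) -> None:
    """Soft failure: undo changes of uncommitted transactions, newest first."""
    for txn, key, before, _after in reversed(log):
        if txn not in committed:
            database[key] = before


def roll_forward(backup: dict[str, int], log: list[LogRecord], committed: set[int]) -> dict[str, int]:
    """Hard failure: start from a backup and reapply committed changes in order."""
    database = dict(backup)
    for txn, key, _before, after in log:
        if txn in committed:
            database[key] = after
    return database


log = [(1, "balance", 100, 150), (2, "balance", 150, 90)]
committed = {1}

live = {"balance": 90}             # state on disk when the system stopped
roll_back(live, log, committed)    # transaction 2 never committed, so undo it

restored = roll_forward({"balance": 100}, log, committed)
```

Both paths arrive at the same consistent state, in which only the committed transaction has taken effect.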
- MPE/iX has a key advantage in that it provides this functionality as
- an integral part of the operating system. Complete transaction
- management occurs transparently for all applications. MVS/ESA provides
- this functionality; however, it is implemented only via its OLTP
- teleprocessing (TP) monitors, CICS or IMS/DC. OLTP applications are
- dependent on these additional subsystems for transaction logging. While
- virtually all MVS/ESA environments utilize these subsystems, having to
- do so adds to the complexity (and cost) of the system customers must manage.
- MPE/iX provides the most complete, transparent, inherent transaction
- manager for consistent data integrity.
- MPE/iX is the most robust commercial operating system in the industry
- next to MVS.
- HP and IBM are on a parity level with regard to their disk mirroring
- and disk array functionality. (In fact, StorageTek uses HP's disk
- technology in its "Iceberg" intelligent storage subsystems. HP is
- investigating Iceberg support for a later date.) MPE/iX has an
- advantage over IBM in being able to mirror disks across a WAN with
- NetBase.
-
- System Availability
- IBM and HP are on a parity level with regard to SPU switchover
- functionality. HP is also on a parity level with DEC clusters. HP has an
- advantage over the AS/400 with regard to system availability for several
- reasons. The AS/400 has no transaction manager. After a system failure,
- disks must be reloaded, otherwise the data is not recovered. This can
- take more than 8 hours! In addition, the AS/400 does not have a
- switchover product.
-
- Disaster Tolerance
- The HP 3000 has an advantage over DEC and IBM with
- regard to wide-area disaster tolerance. IBM cannot mirror disks over a
- WAN, so a remote IBM system cannot take over for a failed primary SPU.
- DEC has the same problem. And DEC VAXclusters don't provide the WAN
- feature required for disaster recovery (because the cluster must reside
- at one location).
-
- HP's Vision for the Future
- HP will incorporate new hardware and software design features to provide
- even greater availability in the future and further reduce the potential
- for system failure. These features also will help minimize or eliminate
- planned down-time for system maintenance.
-
- [Figure: The Drive Towards Continuous Availability, caption: none]
-
- Planned hardware enhancements include:
- o On-line replacement for critical elements, including datacomm links,
- I/O buses, and I/O interfaces
- o On-line CPU and memory replacement for future HP multiprocessing
- systems
- o Hardware resiliency features that prevent component failures from
- causing a hard crash
- o "Memory de-allocation" to allow a memory board to detect correctable
- errors that usually precede a failure and take itself off-line
- o "Graceful degradation" to enable a processor board (in a
- multiprocessing system) to anticipate its own failure and take itself
- off-line without bringing the system down
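The "memory de-allocation" idea in the list above can be sketched as a board that counts its own correctable errors. This is an illustrative model only; the class MemoryBoard and the error limit of 3 are hypothetical, and the source does not say how the planned feature decides when to go off-line.

```python
class MemoryBoard:
    """Board that retires itself once correctable errors reach a limit."""

    def __init__(self, error_limit: int) -> None:
        self.error_limit = error_limit
        self.correctable_errors = 0
        self.online = True

    def record_correctable_error(self) -> None:
        self.correctable_errors += 1
        if self.correctable_errors >= self.error_limit:
            self.online = False   # go off-line before a hard failure occurs


board = MemoryBoard(error_limit=3)
for _ in range(3):
    board.record_correctable_error()
```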
- Software enhancements will include "bullet-proofing" all MPE/iX
- subsystem applications to further enhance software resiliency. Software
- resiliency features are also planned for future releases of HP-UX to
- reduce the root cause of system panics.
-
- Index
- Aggregate Parallel Recovery (APR)
- AutoRestart/iX
- DEC, competing against
- Disaster Recovery Services
- Disk mirroring
- IBM, competing against
- Logical Volume Manager
- MPE/iX
- NetBase
- NetBase Shadow
- Predictive Support
- Redundant Arrays of Inexpensive Disks (RAID)
- SPU SwitchOver/iX
- Support Watch
- SwitchOver/UX
- Table Monitor
- Transaction manager
-
-